On Finding Rank Regret Representatives
Abstract
Selecting the best items in a dataset is a common task in data exploration. However, the concept of “best” lies in the eyes of the beholder: different users may consider different attributes more important and, hence, arrive at different rankings. Nevertheless, one can remove “dominated” items and create a “representative” subset of the data, comprising the “best items” in it. A Pareto-optimal representative is guaranteed to contain the best item of each possible ranking, but it can be a large portion of the data. A much smaller representative can be found if we relax the requirement of including the best item for each user and instead just limit the users’ “regret.” Existing work defines regret as the loss in score by limiting consideration to the representative instead of the full dataset, for any chosen ranking function. However, the score is often not a meaningful number, and users may not understand its absolute value. Sometimes small ranges in score can include large fractions of the dataset. In contrast, users do understand the notion of rank ordering. Therefore, we consider items’ positions in the ranked list in defining the regret and propose the rank-regret representative as the minimal subset of the data containing at least one of the top-k of any possible ranking function. This problem is polynomial time solvable in two-dimensional space but is NP-hard on three or more dimensions. We design a suite of algorithms to fulfill different purposes, such as whether relaxation is permitted on k, the result size, or both, whether the distribution is known, whether theoretical guarantees or practical efficiency is important, and so on. Experiments on real datasets demonstrate that we can efficiently find small subsets with small rank-regrets.
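To make the rank-based notion of regret concrete, the following minimal sketch estimates the rank-regret of a candidate subset. It is an illustration only, not one of the paper's algorithms. It assumes linear ranking functions with non-negative weights and samples them at random, so the returned value is a lower bound on the true rank-regret. The function names, the sample data, and the sampling scheme are all hypothetical.

```python
import random


def rank_of_best(items, subset, weights):
    """Rank (1 = best) of the highest-scoring subset member in the full dataset."""
    def score(point):
        return sum(w * x for w, x in zip(weights, point))

    best_in_subset = max(score(p) for p in subset)
    # Every item that scores strictly higher pushes the subset's best one rank down.
    return 1 + sum(1 for p in items if score(p) > best_in_subset)


def estimated_rank_regret(items, subset, num_functions=1000, seed=0):
    """Worst rank of the subset's best item over sampled linear ranking functions.

    Assumption: ranking functions are linear with non-negative weights and are
    sampled uniformly at random, so this only lower-bounds the true rank-regret.
    """
    rng = random.Random(seed)
    dims = len(items[0])
    worst = 1
    for _ in range(num_functions):
        weights = [rng.random() for _ in range(dims)]
        worst = max(worst, rank_of_best(items, subset, weights))
    return worst


# Hypothetical two-attribute dataset (higher values are better).
items = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6), (0.5, 0.5), (0.1, 0.1)]

# The single balanced item is always within the top 2 of any such ranking.
print(estimated_rank_regret(items, [(0.6, 0.6)]))  # prints 2
```

In this toy dataset the Pareto-optimal set has three items, while the single item (0.6, 0.6) already guarantees a top-2 result for every sampled ranking function, which is the size-versus-regret trade-off the abstract describes.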
Similar resources
RRR: Rank-Regret Representative
Selecting the best items in a dataset is a common task in data exploration. However, the concept of “best” lies in the eyes of the beholder: different users may consider different attributes more important, and hence arrive at different rankings. Nevertheless, one can remove “dominated” items and create a “representative” subset of the data set, comprising the “best items” in it. A Pareto-optim...
Finding Diverse, High-Value Representatives on a Surface of Answers
In many applications, the system needs to selectively present a small subset of answers to users. The set of all possible answers can be seen as an elevation surface over a domain, where the elevation measures the quality of each answer, and the dimensions of the domain correspond to attributes of the answers with which similarity between answers can be measured. This paper considers the proble...
BestTime: Finding Representatives in Time Series Datasets
Given a set of time series, we aim at finding representatives which best comprehend the recurring temporal patterns contained in the data. We demonstrate BestTime, a Matlab application that uses recurrence quantification analysis to find time series representatives.
Finding representatives in a large dataset of spectral reflectances
We propose a new method to construct representative spectra from a large database of spectral reflectances. The key is the optimisation of a Support Vector type functional. The representatives are constructed such that they sit at positions of high density in the set of spectra. At the same time they are constructed to be as orthogonal as possible. The representatives are expressible as a linea...
Finding Minimal Length Representatives in Thompson’s Group F
Cleary and Taback devised a method called the nested traversal method to construct minimal length representatives for positive and negative elements in Thompson’s group. We show how to use the nested traversal method to construct minimal length representatives for a larger class of elements of this
Journal
Journal title: ACM Transactions on Database Systems
Year: 2022
ISSN: ['1557-4644', '0362-5915']
DOI: https://doi.org/10.1145/3531054